Description

Data processing system or communications terminal with
a device for recognizing speech and method for
recognizing certain acoustic objects



Devices and methods for recognizing natural
speech are familiar today to a person skilled in the
art from many different applications. The practical
applicability and capacity of systems of this type
depend very much on their complexity and on the extent
of their range of applications. As a general principle,
the recognition rate of such a system usually decreases
greatly as the number of acoustic objects to be
recognized (words, phonemes, individual letters, etc.)
increases. At the same time, the expenditure, measured
in terms of cost and space requirement but also in
terms of training effort, usually increases greatly
with the range of applications.

Conventional speech recognition systems are 
therefore still not used for many applications, 
although in principle they would be suitable for them 
from the viewpoint of the user. The invention is 
therefore based on the object of specifying a technical 
teaching which makes it possible for speech recognition 
to be used even for those applications where relatively 
great expenditure has to be ruled out for economic or 
other reasons. This object is achieved by a data 
processing system or communications terminal with a 
device for recognizing speech or by a method for 
recognizing certain acoustic objects according to one 
of the patent claims. 

The product according to the invention, a data
processing system or communications terminal, has a
device for recognizing speech which is set up
specifically to recognize certain acoustic objects,
namely individual letters, combinations of letters
or control commands, or which can be specifically
configured to recognize such objects.
The same applies correspondingly to the speech
recognition algorithm of a method according to the 
invention. Furthermore, a device for the acoustic 
output or optical display of recognized acoustic 
objects is provided. In this way, the number or set of 
the acoustic objects to be recognized can be largely 
adapted to the intended application. The envisaged 
device for the acoustic output or optical display of 
recognized acoustic objects makes possible a direct 
feedback between the user and the system, providing the 
user with effective control over the recognition 
capacity and allowing the number of misrecognitions to 
be reduced in a simple but very effective way. 

If the user detects a misrecognition on the
basis of the acoustic output or optical display, he can
repeat the acoustic input of the object to be
recognized. Since this process may not lead quickly
to correct recognition, a preferred embodiment of the
present invention provides that the speech recognition
device is set up or can be configured in such a way
that the recognition of a certain first control
command, following the output or display of an acoustic
object, triggers the output or display of a further
acoustic object. This enables the user, after the
output or display of an acoustic object, for example
after a detected misrecognition, to make the system
output a further acoustic object by the acoustic input
of a special acoustic object, namely a control command.

If, for example, for a selection {AO1, AO2, ..., AOn}
of possible acoustic objects, the device for
speech recognition or the speech recognition algorithm
determines recognition probabilities {p1, p2, ..., pn}
with the property 1 > p1 >= p2 >= ... >= pn > 0,
this preferred
embodiment makes possible, for example, the output or
display of AO2 after the output of the misrecognized
object AO1, or similar measures for supporting a
correction of the recognition error that is as 
convenient as possible for the user. A possible 
selection for such a special acoustic object or such a 
control command would be, for example, the word 
"incorrect". It is not difficult for a person skilled 
in the art to consider on the basis of this description 
further application possibilities for this embodiment 
of the present invention. 
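
The ordering of candidates by recognition probability can be sketched in Python as follows. This is only an illustrative sketch and not the implementation of the invention; the function names rank_candidates and next_candidate and the probability values are assumptions introduced here.

```python
def rank_candidates(probabilities):
    """Return candidate objects sorted by descending recognition probability.

    `probabilities` maps each candidate acoustic object to a hypothetical
    recognition probability p with 0 < p < 1.
    """
    return sorted(probabilities, key=probabilities.get, reverse=True)


def next_candidate(ranked, rejected_count):
    """Return the candidate to output after `rejected_count` rejections.

    Returns None once every candidate has been rejected.
    """
    if rejected_count < len(ranked):
        return ranked[rejected_count]
    return None


# Hypothetical scores; suppose the user actually said "T".
scores = {"T": 0.30, "D": 0.45, "B": 0.15, "E": 0.05}
ranked = rank_candidates(scores)     # corresponds to AO1, AO2, ..., AOn
first = next_candidate(ranked, 0)    # AO1, output first (here misrecognized)
second = next_candidate(ranked, 1)   # AO2, output after one "incorrect"
```

In this sketch each spoken "incorrect" simply increases the rejection count by one, so the user never has to repeat the object itself.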

Further preferred embodiments of the present 
invention are the subject of further subclaims. 

The invention is explained in more detail below 
on the basis of preferred exemplary embodiments with 
the aid of figures. 

Figure 1 shows in a schematic way the structure 
and mode of operation of a preferred embodiment of a 
system according to the invention. 

As represented in figure 1, this embodiment of
a data processing system (DPCD) or communications
terminal (DPCD) according to the invention comprises a
speech recognition unit (SRU), which recognizes
acoustic objects (AO) spoken by a user of the system
and feeds the recognized acoustic objects (RAO) to a
device for acoustic output or optical display (DU).
According to the present invention, the speech
recognition device is set up specifically to recognize
certain acoustic objects (AO), namely
individual letters, combinations of letters or control
commands, or can be configured specifically to
recognize such objects.

The speech recognition device consequently
assigns to each acoustic object (AO) spoken by the user
an acoustic object recognized by this device (RAO).
Since the recognition of natural speech is always
subject to a certain uncertainty for fundamental
reasons, the recognized acoustic object will generally
be, depending on the speech recognition algorithm used,
the most probable or most plausible acoustic object
that comes into consideration, taking into account the
determined features of the spoken acoustic object.

The user receives via the output or display
device (DU) an acknowledgement message concerning the
result of the recognition process. He can then respond
according to the type of result involved. If the
acoustic object was misrecognized, he can notify the
speech recognition algorithm that the acoustic object
has not been correctly recognized, or that he wanted to
have a different object recognized, by saying a control
command intended for this purpose, for example the word
"again". He then has the opportunity to say the desired
object once again. This process can be continued until
the speech recognition unit recognizes the desired
object.

The input of another control command, for 
example the word "incorrect", could control the speech 
recognition algorithm in such a way that a further 
acoustic object is output, preferably that object of 
which the probability or plausibility is admittedly 
lower than that of the object previously output but 
greater than that of all the other objects coming into 
consideration. In this case, it would not be necessary 
for the user to say the object again; instead, further 
candidates would continue to be offered for the object 
to be recognized until the user no longer inputs the 
corresponding control command or possibly inputs an 
expressly confirmatory command, for example "correct". 
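
The interplay of the control commands "again", "incorrect" and "correct" described above might be modeled as a small state machine. The following Python sketch is an assumption made for illustration; the class name CorrectionDialogue, its state names and the fallback to a retry when no candidates remain are not taken from the description.

```python
class CorrectionDialogue:
    """Tracks one recognition attempt and reacts to spoken control commands."""

    def __init__(self, ranked_candidates):
        if not ranked_candidates:
            raise ValueError("at least one candidate is required")
        self.candidates = list(ranked_candidates)
        self.index = 0            # position of the candidate currently output
        self.state = "offered"    # "offered", "accepted" or "retry"

    @property
    def current(self):
        """The candidate that is currently output or displayed."""
        return self.candidates[self.index]

    def handle(self, command):
        """Apply a recognized control command and return the new state."""
        if command == "correct":
            self.state = "accepted"
        elif command == "again":
            # The user wants to say the object once more.
            self.state = "retry"
        elif command == "incorrect":
            if self.index + 1 < len(self.candidates):
                self.index += 1       # offer the next most probable object
            else:
                self.state = "retry"  # assumed fallback: no candidates left
        # Any other input leaves the dialogue unchanged.
        return self.state


dialogue = CorrectionDialogue(["D", "T", "B"])  # hypothetical ranking
dialogue.handle("incorrect")                    # system now offers "T"
offered = dialogue.current
final_state = dialogue.handle("correct")
```

Keeping the candidate list inside the dialogue object means that "incorrect" needs no new acoustic input of the object, whereas "again" discards the list and waits for a repeated input.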

According to a further preferred embodiment, it
is possible to provide a control command, for example
the word "continue", which, when recognized following
the speaking or display of an acoustic object, has the
effect of triggering the display or output of an
object which follows the former object in a certain
sense. The sequence of the objects does not in this 
case have to be fixed by the magnitude of recognition 
probabilities or plausibility values but may also be 
dictated by the sequence of entries in a memory unit 
(MU) of the system, or by alphabetical sequences of 
objects or sequences of objects semantically defined 
within a defined context. For example, the sequence of 
objects could be defined by the order within a 
database, a telephone directory or the structure of a 
file stored in the memory unit, for example a customer 
file, a dictionary, or similar files. 
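
How the meaning of "continue" depends on the chosen sequence can be illustrated with a short Python sketch. The helper successor and the sample names are hypothetical; the sketch only contrasts an order given by memory entries with an alphabetical order.

```python
def successor(entry, ordered_entries):
    """Return the entry that follows `entry` in `ordered_entries`, or None."""
    position = ordered_entries.index(entry)
    if position + 1 < len(ordered_entries):
        return ordered_entries[position + 1]
    return None


# Hypothetical order in which entries were stored in the memory unit (MU).
memory_order = ["Mueller", "Adams", "Meyer", "Baker"]
# The same entries in alphabetical sequence.
alphabetical = sorted(memory_order)

after_memory = successor("Adams", memory_order)  # next entry in memory
after_alpha = successor("Adams", alphabetical)   # next entry alphabetically
```

The same command therefore yields different follow-up objects depending on which sequence the system is configured to use.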

When this patent application mentions devices 
which are set up or can be configured for a certain 
function or mode of operation, this means that the 
corresponding functional features of these devices may 
be permanently or temporarily restricted. Furthermore, 
these devices can be set up or configured by all those 
involved between the manufacturer and the user by 
manufacturing processes, settings on the hardware or 
the use or parameterization of software or equivalent 
means or measures for a certain function or mode of 
operation. A person skilled in the art will readily 
deduce from this description numerous similar or 
equivalent means or measures for this purpose. 

A speech recognition device is preferably set 
up or configured by a suitable selection or 
parameterization of the software which realizes the 
desired function in the speech recognition algorithm 
and/or the sequence control of this device. A data 
memory is preferably set up or configured by a suitable 
selection or parameterization of the data structure, 
for example the database structure, which defines the 
type of storage of the data on this memory and the type 
of access to these data.

The effective recognition capacity of the
system can be distinctly improved if the recognition of
an acoustic object, or of a sequence of objects, which
corresponds to an entry in the data memory triggers the
display or output of this entry (ME) or of a function
(FU) of the system associated with this entry. As a
result, the
existing prior knowledge of the objects likely to be 
recognized can be utilized very advantageously. 
Although this technique is known in principle to a 
person skilled in the art, it is particularly 
effective, as appropriate tests have shown, in 
connection with a speech recognition system specially 
designed to recognize a limited set of objects to be 
recognized, for example individual letters. 

So if, for example, the first three letters of
an entry in a telephone directory are recognized, a
preferred embodiment of the invention provides for the
output or display of this telephone directory entry.
If it is not the desired entry, it may be sufficient to
input (i.e. say) one or a few further control commands,
such as "continue", "street", "fax number" or
"connect". In this way, starting from the name of a
subscriber known to the user, the user can obtain the
output of the subscriber's fax number, or have the
communications terminal dial this number, by saying the
first three letters of the name. Other functions which
could be triggered in this way, such as the output of a
text or image, the display of a data record, etc., are
too numerous to list here.
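
A directory lookup by recognized initial letters, followed by control commands, could look roughly like the following Python sketch. The Entry record, the functions find_by_prefix and apply_command, the wrap-around on "continue" and all sample data are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    phone: str
    fax: str


def find_by_prefix(directory, letters):
    """Return all entries whose name starts with the recognized letters."""
    prefix = "".join(letters).upper()
    return [e for e in directory if e.name.upper().startswith(prefix)]


def apply_command(matches, position, command):
    """Return (new_position, output) for a control command on the matches."""
    entry = matches[position]
    if command == "continue":
        # Assumed behavior: step to the next match, wrapping at the end.
        position = (position + 1) % len(matches)
        return position, matches[position].name
    if command == "fax number":
        return position, entry.fax
    if command == "connect":
        return position, "dialing " + entry.phone
    return position, entry.name


# Hypothetical directory with made-up numbers.
directory = [
    Entry("Mueller", "555-0101", "555-0102"),
    Entry("Muench", "555-0201", "555-0202"),
    Entry("Meyer", "555-0301", "555-0302"),
]
matches = find_by_prefix(directory, ["M", "U", "E"])
position, shown = apply_command(matches, 0, "continue")
position, fax = apply_command(matches, position, "fax number")
```

Here three spoken letters and two commands are enough to reach the fax number of the second matching entry.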

The capacity of the systems or methods which
realize the present invention can be further increased
by providing certain control commands, such as for
example "letter", "control" or "combination", etc., the
speaking of which enables the user to restrict the set
of objects to be recognized according to his choice
(temporarily or permanently) to a certain subset, such
as for example individual letters, combinations of
letters or control commands.
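
Restricting the set of objects to be recognized can be sketched as a filter applied before the best candidate is chosen. In the following Python sketch the names LETTERS, CONTROLS, MODE_COMMANDS and best_in_active_set, as well as the scores, are hypothetical, and only two of the possible subsets are modeled.

```python
LETTERS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
CONTROLS = {"again", "incorrect", "correct", "continue"}

# Hypothetical mapping from a mode command to the subset it activates.
MODE_COMMANDS = {"letter": LETTERS, "control": CONTROLS}


def best_in_active_set(scores, active):
    """Return the highest-scoring object that belongs to the active subset."""
    allowed = {obj: p for obj, p in scores.items() if obj in active}
    if not allowed:
        return None
    return max(allowed, key=allowed.get)


# Hypothetical scores for one utterance.
scores = {"continue": 0.5, "C": 0.4, "T": 0.1}

unrestricted = best_in_active_set(scores, LETTERS | CONTROLS)
letters_only = best_in_active_set(scores, MODE_COMMANDS["letter"])
```

With the full set active the utterance is taken as a command, whereas after the "letter" command only letters compete and "C" wins.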

With the present invention, in particular the
number of telephone entries which can be called up by
voice selection in a mobile telephone, cordless phone
or wire-bound telephone can be increased at will.
In customary systems of this type, only a limited
number of entries was allowed for voice selection, from
experience at most 20 or 30 entries. This was due to
the memory space to be made available for the voice
samples to be re-recognized, i.e. due to the resultant
costs and space requirement. If the number of entries
was increased further, experience showed that the
effort for training the speech recognition increased
considerably, which led to lower user acceptance.

According to a preferred embodiment of the
present invention, the speech recognition algorithm is
trained by the user only for the letters of the
alphabet, and possibly combinations, and just a few
control commands. It is in this way set up or
appropriately configured by the user for the
recognition of these acoustic objects. Interrogation
takes place by the acoustic input of initial letters
and (preferably up to two) subsequent letters.
Misrecognitions are reduced by plausibility checks,
i.e. for example by comparison of the objects with
entries in a memory device. The names input are spoken
only once and converted in an encoder with a low bit
rate (for example half rate of GSM) and stored at the
corresponding memory location, possibly in a compressed
form.

Alternatively, a synthesis program which
synthesizes voice from a name may also be used,
possibly requiring less memory space. In any event,
the speech recognition does not have to be trained for
a large number of names but only for a fixed set of
approximately 30 sequences of letters and control
commands.

To use this embodiment of the invention, the
user activates the service feature "voice selection",
for example by means of the scroll key at the side, and
inputs the first letters of the entry sought, possibly
in the form "letter A" etc. Experience shows that the
probability of recognition is considerably greater in
this case than in the case of a single letter. Each
input is acoustically acknowledged by the recognized
object being output. If the object was correctly
recognized, the next object to be recognized is input.

If an object is recognized wrongly, the user
responds with "incorrect" or "no". The system then
proposes the next most probable letter, for example a
"T" instead of "D", or an "A" instead of "H", and so
on. In most cases, it is sufficient to input the first
two or three letters to find the correct entry. If a
corresponding control command is input or no further
input takes place (control command = pause in speech),
the terminal outputs the corresponding name in the
telephone directory of the terminal. If there are a
number of entries with the same initial sequence of
letters, the user issues, for example, the command
"continue" until the correct name is acknowledged.

If a letter is recognized wrongly and, as a
consequence, a first letter that is remote in the
alphabet (for example "T" instead of "D") is output
as the beginning of the input combination of letters,
the user inputs (i.e. speaks) the control command
"selection". The terminal then proposes the next most
probable correct combination of initial letters.
Knowledge of the names stored in the telephone
directory allows most possible wrong combinations to be
ruled out from the outset. After that, the user issues
the command "dial".
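
The plausibility check against the stored names can be sketched as follows in Python. This is a hedged illustration rather than the method of the invention: the function plausible_prefixes, the sample scores and the assumption that per-letter probabilities may simply be multiplied are all introduced here.

```python
from itertools import product


def plausible_prefixes(letter_scores, names):
    """Rank letter combinations that begin at least one stored name.

    `letter_scores` holds, for each spoken position, a dict mapping
    candidate letters to hypothetical recognition probabilities.
    Combinations that match no stored name are ruled out entirely.
    """
    ranked = []
    for combo in product(*(scores.items() for scores in letter_scores)):
        prefix = "".join(letter for letter, _ in combo)
        if any(name.upper().startswith(prefix) for name in names):
            probability = 1.0
            for _, p in combo:
                probability *= p  # assumes independent letter scores
            ranked.append((probability, prefix))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return [prefix for _, prefix in ranked]


# Hypothetical recognizer output for the spoken letters "D", "A".
letter_scores = [
    {"T": 0.5, "D": 0.4, "B": 0.1},
    {"A": 0.8, "H": 0.2},
]

with_t_name = plausible_prefixes(letter_scores, ["Dahl", "Tamm"])
without_t_name = plausible_prefixes(letter_scores, ["Dahl", "Daniels", "Huber"])
```

When no stored name begins with "T", the misrecognized first letter never reaches the user; otherwise the list gives the order in which combinations would be proposed on "selection".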

Claims

1. A data processing system (DPCD) or
communications terminal (DPCD) with a device (SRU) for
recognizing speech having the following features:

a) the speech recognition device is set up specifically
to recognize certain acoustic objects (AO), to be
specific individual letters, combinations of letters or
control commands, or can be configured specifically to
recognize such objects;

b) a device for the acoustic output (DU) or optical
display (DU) of recognized acoustic objects (RAO) is
provided.

2. The system as claimed in claim 1, the speech
recognition device (SRU) of which is set up or can be
configured in such a way that the recognition of a
certain first control command has the effect following
the output or display of an acoustic object of
triggering the output or display of a further acoustic
object.

3. The system as claimed in one of the preceding
claims, having a data memory (MU) which is set up or
can be configured in such a way that the recognition of
an acoustic object or a sequence of objects which
corresponds or correspond to an entry in the data
memory has the effect of triggering the display or
output of this entry (ME) or a function (FU) of the
system associated with this entry.

4. The system as claimed in claim 3, in which the
recognition capacity is improved by a comparison of
possible objects or object sequences with existing
entries in the data memory (MU).

5. The system as claimed in one of the preceding
claims, the speech recognition device of which can be
brought with the aid of certain control commands into
specific operating states for the recognition of
individual letters, combinations of letters or control
commands.

6. A method for recognizing certain acoustic
objects, in which

a) a speech recognition algorithm which is set up
specifically to recognize certain acoustic objects, to
be specific individual letters, combinations of letters
or control commands, or can be configured specifically
to recognize such objects is used;

b) recognized acoustic objects are acoustically output
or optically displayed.

7. The method as claimed in claim 6, which is set
up or can be configured in such a way that the
recognition of a certain first control command has the
effect following the output or display of an acoustic
object of triggering the output or display of a further
acoustic object.

8. The method as claimed in one of the preceding
method claims, which is set up or can be configured in
such a way that the recognition of an acoustic object
or a sequence of objects which corresponds or
correspond to an entry in the data memory has the
effect of triggering the display or output of this
entry or a function of the system associated with this
entry.

9. The method as claimed in one of the preceding
method claims, in which the recognition capacity is
improved by a comparison of possible objects or object
sequences with existing entries in the data memory.

10. The method as claimed in one of the preceding
method claims, the speech recognition algorithm of
which can be brought with the aid of certain control
commands into specific operating states for the
recognition of individual letters, combinations of
letters or control commands.



GR 98 P 4724 



Abstract 

Data processing system or communications terminal with 
a device for recognizing speech and method for 
recognizing certain acoustic objects 

Small devices with database functionality, for
example mobile telephones with a telephone directory
function, can be controlled with the aid of a
simplified speech recognition device which is specially
designed for the recognition of control
commands and individual letters or combinations of
letters. This makes it possible for the recognition
capacity to be improved and allows larger databases to
be used with fewer demands on the capacity of the
hardware.

I hereby claim the benefit under Title 35, United States
Code, §120 of any United States application(s) listed
below and, insofar as the subject matter of each of the
claims of this application is not disclosed in the prior
United States application in the manner provided by
the first paragraph of Title 35, United States Code,
§122, I acknowledge the duty to disclose material
information as defined in Title 37, Code of Federal
Regulations, §1.56(a) which occurred between the filing
date of the prior application and the national or PCT
international filing date of this application.


Application Serial No.: PCT/DE99/00068
Filing Date (D, M, Y): 14.01.1999
Status (patented, pending, abandoned): (not given)



I hereby declare that all statements made herein of my 
own knowledge are true and that all statements made 
on information and belief are believed to be true, and 
further that these statements were made with the 
knowledge that willful false statements and the like so 
made are punishable by fine or imprisonment, or both, 
under Section 1001 of Title 18 of the United States 
Code and that such willful false statements may 
jeopardize the validity of the application or any patent 
issued thereon. 

German Language Declaration

POWER OF ATTORNEY: As a named inventor, I
hereby appoint the following attorney(s) and/or
agent(s) to prosecute this application and transact all
business in the Patent and Trademark Office
connected therewith (list name and registration
number):

And I hereby appoint

Direct Telephone Calls to (name and telephone number):

Send Correspondence to:

MORRISON AND FOERSTER LLP
2000 PENNSYLVANIA AVE, NW
WASHINGTON, DC 20006-1888



Full name of sole or first inventor: FRIEDRICH MUELLER

Inventor's signature and date: (not legible in source)

Residence: MUENCHEN, GERMANY

Citizenship: DE

Post Office Address: (partly illegible in source) TEUTONENSTR., MUENCHEN

